group size
- North America (0.13)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- North America > Canada > Quebec > Montreal (0.05)
- North America > United States (0.04)
- Europe > United Kingdom (0.04)
- (2 more...)
A PAC-Bayesian Generalization Bound for Equivariant Networks
Equivariant networks capture the inductive bias about the symmetry of the learning task by building those symmetries into the model. In this paper, we study how equivariance relates to generalization error utilizing PAC Bayesian analysis for equivariant networks, where the transformation laws of feature spaces are determined by group representations. By using perturbation analysis of equivariant networks in Fourier domain for each layer, we derive norm-based PAC-Bayesian generalization bounds. The bound characterizes the impact of group size, and multiplicity and degree of irreducible representations on the generalization error and thereby provide a guideline for selecting them. In general, the bound indicates that using larger group size in the model improves the generalization error substantiated by extensive numerical experiments.
R2Q: Towards Robust 2-Bit Large Language Models via Residual Refinement Quantization
Chen, Jiayi, Shi, Jieqi, Huo, Jing, Wu, Chen
The rapid progress of Large Language Models (LLMs) has brought substantial computational and memory demands, spurring the adoption of low-bit quantization. While 8-bit and 4-bit formats have become prevalent, extending quantization to 2 bits remains challenging due to severe accuracy degradation. To address this, we propose Residual Refinement Quantization (R2Q)-a novel 2-bit quantization framework that decomposes the process into two sequential 1-bit sub-quantizations, forming an adaptive quantization lattice. Extensive evaluations on Llama, OPT, and Qwen across diverse benchmarks-covering question answering, commonsense reasoning, and language modeling-demonstrate that R2Q consistently outperforms existing 2-bit quantization methods in both fine-grained and coarse-grained settings. By refining quantization through a residual learning mechanism, R2Q enhances performance, improves training stability, and accelerates convergence under extreme compression. Furthermore, its modular design enables seamless integration with existing quantization-aware training (QAT) frameworks.
- North America > United States > Virginia (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
Enhancing Group Recommendation using Soft Impute Singular Value Decomposition
Ibrahim, Mubaraka Sani, Saidu, Isah Charles, Csato, Lehel
The growing popularity of group activities increased the need to develop methods for providing recommendations to a group of users based on the collective preferences of the group members. Several group recommender systems have been proposed, but these methods often struggle due to sparsity and high-dimensionality of the available data, common in many real-world applications. In this paper, we propose a group recommender system called Group Soft-Impute SVD, which leverages soft-impute singular value decomposition to enhance group recommendations. This approach addresses the challenge of sparse high-dimensional data using low-rank matrix completion. We compared the performance of Group Soft-Impute SVD with Group MF based approaches and found that our method outperforms the baselines in recall for small user groups while achieving comparable results across all group sizes when tasked on Goodbooks, Movielens, and Synthetic datasets. Furthermore, our method recovers lower matrix ranks than the baselines, demonstrating its effectiveness in handling high-dimensional data.
- North America > United States > Massachusetts > Suffolk County > Boston (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Africa > Nigeria > Federal Capital Territory > Abuja (0.04)
- (4 more...)
Life-cycle Modeling and the Walking Behavior of the Pedestrian-Group as an Emergent Agent: With Empirical Data on the Cohesion of the Group Formation
Albeaik, Saleh, Alrished, Mohamad, Alsallum, Faisal
This article investigates the pedestrian group as an emergent agent. The article explores empirical data to derive emergent agency and formation state spaces and outline recurring patterns of walking behavior. In this analysis, pedestrian trajectories extracted from surveillance videos are used along with manually annotated pedestrian group memberships. We conducted manual expert evaluation of observed groups, produced new manual annotations for relevant events pertaining to group behavior and extracted metrics relevant group formation. This information along with quantitative analysis was used to model the life-cycle and formation of the group agent. Those models give structure to expectations around walking behavior of groups; from pedestrian walking independently to the emergence of a collective intention where group members tended to maintain bounded distance between each other. Disturbances to this bounded distance often happened in association with changes in either their agency or their formation states. We summarized the patterns of behavior along with the sequences of state transitions into abstract patterns, which can aid in the development of more detailed group agents in simulation and in the design of engineering systems to interact with such groups.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
Egiazarian, Vage, Castro, Roberto L., Kuznedelev, Denis, Panferov, Andrei, Kurtic, Eldar, Pandit, Shubhra, Marques, Alexandre, Kurtz, Mark, Ashkboos, Saleh, Hoefler, Torsten, Alistarh, Dan
The recent hardware-accelerated microscaling 4-bit floating-point formats such as MXFP4 and NVFP4, supported on NVIDIA and AMD GPUs, promise to revolutionize large language model (LLM) inference. Yet, their practical benefits remain unproven. We present the first comprehensive study of MXFP4 and NVFP4 for post-training quantization, revealing gaps between their promise and real-world performance. Our analysis shows that state-of-the-art methods struggle with FP4, due to two key issues: (1) NVFP4's small group size provably neutralizes traditional outlier mitigation techniques; (2) MXFP4's power-of-two scale quantization severely degrades accuracy due to high induced error. To bridge this gap, we introduce Micro-Rotated-GPTQ (MR-GPTQ), a variant of the classic GPTQ quantization algorithm that tailors the quantization process to FP4's unique properties, by using block-wise Hadamard transforms and format-specific optimizations. We support our proposal with a set of high-performance GPU kernels that enable the MR-GPTQ format with negligible overhead, by rotation fusion into the weights, and fast online computation of the activations. This leads to speedups vs. FP16 of up to 3.6x layer-wise, and 2.2x end-to-end on NVIDIA B200, and of 6x layer-wise and 4x end-to-end on RTX5090. Our extensive empirical evaluation demonstrates that MR-GPTQ matches or outperforms state-of-the-art accuracy, significantly boosting MXFP4, to the point where it can near the accuracy that of NVFP4. We conclude that, while FP4 is not an automatic upgrade over INT4, format-specialized methods like MR-GPTQ can unlock a new frontier of accuracy-performance trade-offs.
- Europe > Austria > Vienna (0.14)
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony
Lu, Han, Liu, Zichen, Xiong, Shaopan, He, Yancheng, Gao, Wei, Wu, Yanan, Wang, Weixun, Liu, Jiashun, Li, Yang, Zhao, Haizhou, Huang, Ju, Yang, Siran, Li, Xiaoyang, Luo, Yijia, Liu, Zihe, Pan, Ling, Yan, Junchi, Wang, Wei, Su, Wenbo, Wang, Jiamang, Qu, Lin, Zheng, Bo
Synchronous Reinforcement Learning (RL) post-training has emerged as a crucial step for enhancing Large Language Models (LLMs) with diverse capabilities. However, many systems designed to accelerate RL post-training still suffer from low resource utilization and limited scalability. We present ROLL Flash, a system that extends ROLL with native support for asynchronous RL post-training. ROLL Flash is built upon two core design principles: fine-grained parallelism and rollout-train decoupling. Guided by these principles, ROLL Flash provides flexible programming interfaces that enable a fully asynchronous training architecture and support efficient rollout mechanisms, including queue scheduling and environment-level asynchronous execution. Through comprehensive theoretical analysis and extensive experiments, we demonstrate that ROLL Flash significantly improves resource utilization and scalability over synchronous RL post-training. ROLL Flash achieves up to 2.24x speedup on RLVR tasks and 2.72x on agentic tasks, using the same GPU budget as synchronous baselines. Furthermore, we implement several popular off-policy algorithms and verify that asynchronous training can achieve performance on par with synchronous training.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (2 more...)